Graphs are ubiquitous in nature and can therefore serve as models for many practical as well as theoretical problems. To this end, they can be defined in many different ways, each suitably reflecting the context of the problem being represented. To address cutting-edge problems based on graph data, the research field of Graph Neural Networks (GNNs) has emerged. Despite the field's youth and the speed at which new models are developed, many recent surveys have been published to keep track of them. Nevertheless, no survey has yet gathered which GNNs can process which kinds of graph. In this survey, we give a detailed overview of existing GNNs and, unlike previous surveys, categorize them according to their ability to handle graphs of different types and properties. We consider GNNs operating on static and dynamic graphs of different structural constitutions, with or without node or edge attributes. Moreover, we distinguish between GNN models for discrete-time and continuous-time dynamic graphs, and we group the models according to their architecture. We find that there are still graph types that are not, or only rarely, covered by existing GNN models. We point out where models are missing and give potential reasons for their absence.
Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings. In most cases, the estimated distribution sums to 1 over all finite strings. However, in some pathological cases, probability mass can ``leak'' onto the set of infinite sequences. In order to characterize the notion of leakage more precisely, this paper offers a measure-theoretic treatment of language modeling. We prove that many popular language model families are in fact tight, meaning that they will not leak in this sense. We also generalize characterizations of tightness proposed in previous works.
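As a toy illustration of leakage (my example, not taken from the paper): consider a model over a single symbol $a$ that emits the end-of-string token at step $t$ with probability $p_t$. Then

```latex
% Probability that generation ever halts:
P(\text{finite}) \;=\; 1 - \prod_{t=1}^{\infty} (1 - p_t),
% which equals 1 (the model is tight) iff the product vanishes, i.e. iff
\sum_{t=1}^{\infty} p_t = \infty .
% Example of leakage: with p_t = 2^{-t},
\prod_{t=1}^{\infty} \bigl(1 - 2^{-t}\bigr) \approx 0.2888 > 0,
% so roughly 0.29 of the probability mass ``leaks'' onto the infinite string aaa...
```

A constant stopping probability $p_t = p > 0$ makes the sum diverge, so such a model is tight; leakage requires the stopping probability to decay summably fast.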
After just a few hundred training updates, a standard probabilistic model for language generation has likely not yet learnt many semantic or syntactic rules of natural language, which inherently makes it difficult to estimate the right probability distribution over next tokens. Yet around this point, these models have identified a simple, loss-minimising behaviour: to output the unigram distribution of the target training corpus. The use of such a crude heuristic raises the question: Rather than wasting precious compute resources and model capacity for learning this strategy at early training stages, can we initialise our models with this behaviour? Here, we show that we can effectively endow our model with a separate module that reflects unigram frequency statistics as prior knowledge. Standard neural language generation architectures offer a natural opportunity for implementing this idea: by initialising the bias term in a model's final linear layer with the log-unigram distribution. Experiments in neural machine translation demonstrate that this simple technique: (i) improves learning efficiency; (ii) achieves better overall performance; and (iii) appears to disentangle strong frequency effects, encouraging the model to specialise in non-frequency-related aspects of language.
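A minimal sketch of the bias-initialisation idea (function and variable names are my own; in practice the resulting vector would be copied into the final linear layer of the generation model, e.g. a PyTorch `nn.Linear.bias`):

```python
import math
from collections import Counter

def log_unigram_bias(corpus_tokens, vocab, smoothing=1.0):
    """Per-token log-unigram values to use as the final-layer bias.

    Additive smoothing keeps log() finite for vocabulary items that never
    occur in the training corpus.
    """
    counts = Counter(corpus_tokens)
    total = len(corpus_tokens) + smoothing * len(vocab)
    return [math.log((counts[tok] + smoothing) / total) for tok in vocab]

# Toy corpus: "the" is far more frequent than "cat", which beats unseen "dog".
vocab = ["the", "cat", "dog"]
corpus = ["the", "the", "the", "the", "cat"]
bias = log_unigram_bias(corpus, vocab)
```

Because the softmax of a zero logit vector plus this bias reproduces the smoothed unigram distribution, the untrained model already outputs the loss-minimising heuristic described above.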
Recent developments in advanced driver-assistance systems necessitate an increasing number of tests to validate new technologies. These tests cannot be carried out on track in a reasonable amount of time, and automotive groups rely on simulators to perform most of them. The reliability of these simulators for constantly refined tasks is becoming an issue, and, to increase the number of tests, the industry is now developing surrogate models that mimic the behavior of the simulator while being much faster to run on specific tasks. In this paper we aim to construct such a surrogate model to mimic and replace the simulator. We first test several classical methods such as random forests, ridge regression, and convolutional neural networks. Then we build three hybrid models that use all of these methods and combine them to obtain an efficient hybrid surrogate model.
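The abstract does not specify how the base models are combined; one common hybridisation scheme, sketched here under that assumption, weights each base model in inverse proportion to its validation error:

```python
def inverse_error_weights(val_errors):
    """Weight each base model inversely to its validation error, normalised to sum to 1."""
    inv = [1.0 / e for e in val_errors]
    total = sum(inv)
    return [w / total for w in inv]

def hybrid_predict(base_predictions, weights):
    """Combine per-model prediction lists as a weighted average, element-wise."""
    return [sum(w * preds[i] for w, preds in zip(weights, base_predictions))
            for i in range(len(base_predictions[0]))]

# Three hypothetical base models (e.g. random forest, ridge, CNN) with
# validation errors 0.2, 0.4 and 0.4 receive weights 0.5, 0.25 and 0.25.
w = inverse_error_weights([0.2, 0.4, 0.4])
pred = hybrid_predict([[1.0, 2.0], [3.0, 2.0], [1.0, 4.0]], w)
```

More elaborate hybrids (stacking with a learned meta-model) follow the same structure, replacing the fixed weights with a trained combiner.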
The materials science literature contains up-to-date and comprehensive scientific knowledge of materials. However, its content is unstructured and diverse, leaving a significant gap in the provision of sufficient information for material design and synthesis. To this end, we used natural language processing (NLP) and computer vision (CV) techniques based on convolutional neural networks (CNNs) to discover valuable experiment-based information about nanomaterials and synthesis methods in energy-material-related publications. Our first system, TextMaster, extracts opinions from texts and classifies them into challenges and opportunities, achieving 94% and 92% accuracy, respectively. Our second system, GraphMaster, extracts data from tables and figures in publications with 98.3% classification accuracy and 4.3% data-extraction mean square error. Our results show that these systems can assess the suitability of materials for a given application by evaluating synthesis insights and case analyses with detailed references. This work offers a fresh perspective on mining knowledge from the scientific literature, providing a wide swath along which to accelerate nanomaterial research through CNNs.
Ovarian cancer is the most lethal gynaecological malignancy. The disease is most often asymptomatic at early stages, and its diagnosis relies on expert evaluation of transvaginal ultrasound images. Ultrasound is the first-line imaging modality for characterising adnexal masses; it requires significant expertise, and its analysis is subjective and labour-intensive, and therefore prone to error. Hence, automated processes that facilitate and standardise the evaluation of scans are needed in clinical practice. Using supervised learning, we demonstrate that segmentation of adnexal masses is possible; however, prevalence and label imbalance restrict performance on under-represented classes. To mitigate this, we apply a novel pathology-specific data synthesiser. We create synthetic medical images and their corresponding ground-truth segmentations by using Poisson image editing to integrate less common masses into other samples. Our approach achieves the best performance across all classes, including an improvement of up to 8% compared with nnU-Net baseline approaches.
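Poisson image editing blends a source patch into a target image by matching the patch's gradients rather than its raw intensities, so only the boundary values come from the target. A 1-D analogue (a didactic sketch, not the paper's 2-D implementation) makes the mechanism concrete:

```python
def seamless_paste_1d(target, source, start, iters=2000):
    """1-D analogue of Poisson image editing. Interior samples are solved so
    that their discrete gradient matches the source patch, while the patch
    boundary stays pinned to the target signal (Gauss-Seidel iteration)."""
    n = len(source)
    out = list(target)  # out[start] and out[start + n - 1] keep their target values
    for _ in range(iters):
        for i in range(1, n - 1):
            # Discrete Poisson equation: 2 f_i - f_{i-1} - f_{i+1} = 2 g_i - g_{i-1} - g_{i+1}
            lap = 2.0 * source[i] - source[i - 1] - source[i + 1]
            out[start + i] = (out[start + i - 1] + out[start + i + 1] + lap) / 2.0
    return out

# Paste a bright bump into a flat background: the bump's shape survives, but
# its values shift so the seam at the patch boundary disappears.
blended = seamless_paste_1d([0.0] * 10, [10.0, 11.0, 12.0, 11.0, 10.0], start=3)
```

In 2-D the same idea yields a discrete Laplace equation over the pasted region, which is why a rare mass can be inserted into a healthy scan without visible seams.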
The goal of this work is to help mitigate the already existing gender wage gap by supplying unbiased job recommendations based on the resumes of job seekers. We employ a generative adversarial network to remove gender bias from word2vec representations of 12M job vacancy texts and 900k resumes. Our results show that the representations created from recruitment texts contain algorithmic bias, and that this bias has real-world consequences for recommender systems. Without controlling for bias, women are recommended jobs with significantly lower salaries in our data. With adversarially fair representations, this wage gap disappears, meaning that our debiased job recommendations reduce wage discrimination. We conclude that adversarial debiasing of word representations can increase the real-world fairness of systems and may thus be part of the solution for creating fairness-aware recommender systems.
Clinical records frequently include assessments of the characteristics of patients, which may involve the completion of various questionnaires. These questionnaires provide a variety of perspectives on a patient's current state of wellbeing. Not only is it critical to capture the heterogeneity given by these perspectives, but there is also a growing demand for cost-effective technologies for clinical phenotyping. Filling out many questionnaires can be a strain for patients and is therefore costly. In this work, we propose COBALT, a cost-based layer selector model for detecting phenotypes using a community detection approach. Our goal is to minimise the number of features used to build these phenotypes while preserving their quality. We test our model on questionnaire data from patients with chronic tinnitus, representing the data in a multi-layer network structure. The model is then evaluated by predicting post-treatment data using baseline features (age, gender, and pre-treatment data) as well as the identified phenotypes as features. For some post-treatment variables, predictors using phenotypes from COBALT as features outperformed those using phenotypes detected by traditional clustering methods. Moreover, using phenotype data to predict post-treatment data proved beneficial compared with predictors trained only on baseline features.
Camera traps have revolutionised the study of many animal species that were previously nearly impossible to observe due to their habitat or behaviour. They are generally fixed to a tree and take a short sequence of images when triggered. Deep learning has the potential to reduce the workload by automatically classifying images according to taxon, or as empty. However, a standard deep neural network classifier fails because animals often occupy only a small portion of the high-definition images. That is why we propose a workflow named Weakly Object Detection Faster-RCNN+FPN suited to this challenge. The model is weakly supervised because it requires only the animal taxon label per image, without any manual bounding-box annotation. First, it automatically performs weakly-supervised bounding-box annotation using motion across multiple frames. Then, it trains a Faster-RCNN+FPN model using this weak supervision. Experimental results are reported on two datasets, from biodiversity monitoring campaigns in Papua New Guinea and Missouri, and then on an easily reproducible testbed.
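The first stage, deriving a weak bounding box from inter-frame motion, can be sketched as simple frame differencing (a toy illustration on small integer grids; the function name and thresholding scheme are my own, not the paper's):

```python
def motion_bbox(frames, threshold):
    """Weak bounding-box annotation from motion: pixels whose value changes
    between consecutive frames by more than `threshold` are treated as the
    animal, and the tightest box around them is returned as
    (top, left, bottom, right). Returns None when no motion is detected."""
    h, w = len(frames[0]), len(frames[0][0])
    moving = [[False] * w for _ in range(h)]
    for prev, cur in zip(frames, frames[1:]):
        for y in range(h):
            for x in range(w):
                if abs(cur[y][x] - prev[y][x]) > threshold:
                    moving[y][x] = True
    ys = [y for y in range(h) for x in range(w) if moving[y][x]]
    xs = [x for y in range(h) for x in range(w) if moving[y][x]]
    if not ys:
        return None  # likely an empty image
    return (min(ys), min(xs), max(ys), max(xs))

# Two 5x5 frames in which the "animal" brightens two pixels between shots.
frame_a = [[0] * 5 for _ in range(5)]
frame_b = [row[:] for row in frame_a]
frame_b[1][1] = frame_b[2][3] = 200
box = motion_bbox([frame_a, frame_b], threshold=50)
```

These automatically derived boxes then serve as the (noisy) supervision signal for training the Faster-RCNN+FPN detector.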
The inverse design of short single-stranded RNA and DNA sequences (aptamers) is the task of finding sequences that satisfy a set of desired criteria. Relevant criteria may be the presence of specific folding motifs, binding to molecular ligands, sensing properties, and so on. Most practical approaches to aptamer design rely on high-throughput experiments such as SELEX, and then optimise performance by introducing only minor modifications to empirically found candidates. Sequences with the desired properties but drastically different chemical compositions would add diversity to the search space and facilitate the discovery of useful nucleic acid aptamers. Systematic diversification protocols are needed. Here, we propose to use an unsupervised machine learning model, known as the Potts model, to discover new, useful sequences with controllable sequence diversity. We first train the Potts model, using the maximum entropy principle, on a set of empirically identified sequences unified by a common feature. To generate new candidate sequences with a controllable degree of diversity, we exploit a spectral feature of the model: an energy band gap separating sequences that are similar to the training set from those that are distinct from it. By controlling the Potts energy range from which we sample, we generate sequences that are distinct from the training set yet still likely to encode function. To demonstrate performance, we apply our approach to design diverse pools of sequences with a specified secondary-structure motif in 30-mer RNA and DNA aptamers.
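A toy sketch of the Potts energy and the energy-band filtering idea (the two-letter alphabet, the specific fields, and the coupling values are invented for illustration; real aptamer models use the four-letter nucleic acid alphabet and fitted parameters):

```python
import itertools

def potts_energy(seq, fields, couplings):
    """Potts energy E(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j)."""
    e = -sum(fields[i][a] for i, a in enumerate(seq))
    for (i, j), J in couplings.items():
        e -= J[seq[i]][seq[j]]
    return e

def in_energy_band(sequences, fields, couplings, e_min, e_max):
    """Keep candidates whose Potts energy lies in [e_min, e_max]: sampling
    from a chosen energy range is how diversity relative to the training
    set is controlled."""
    return [s for s in sequences
            if e_min <= potts_energy(s, fields, couplings) <= e_max]

# Toy 2-letter alphabet (0 = A, 1 = U), length-3 sequences; the fields favour
# the pattern A-U-A, and one coupling rewards A at positions 0 and 2 jointly.
fields = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
couplings = {(0, 2): [[0.5, 0.0], [0.0, 0.0]]}
candidates = list(itertools.product(range(2), repeat=3))
low_band = in_energy_band(candidates, fields, couplings, -4.0, -3.0)
```

Shifting the `[e_min, e_max]` window toward higher energies yields sequences increasingly dissimilar to the low-energy (training-like) pool, which is the lever the diversification protocol turns.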